On the Role of Nonlinear Transformations in Deep Neural Network Acoustic Models

نویسندگان

  • Tasha Nagamine
  • Michael L. Seltzer
  • Nima Mesgarani
چکیده

Deep neural networks (DNNs) are widely utilized for acoustic modeling in speech recognition systems. Through training, DNNs used for phoneme recognition nonlinearly transform the time-frequency representation of a speech signal into a sequence of invariant phonemic categories. However, little is known about how this nonlinear mapping is performed and what its implications are for the classification of individual phones and phonemic categories. In this paper, we analyze a sigmoid DNN trained for a phoneme recognition task and characterized several aspects of the nonlinear transformations that occur in hidden layers. We show that the function learned by deeper hidden layers becomes increasingly nonlinear, and that network selectively warps the feature space so as to increase the discriminability of acoustically similar phones, aiding in their classification. We also demonstrate that the nonlinear transformation of the feature space in deeper layers is more dedicated to the phone instances that are more difficult to discriminate, while the more separable phones are dealt with in the superficial layers of the network. This study describes how successive nonlinear transformations are applied to the feature space non-uniformly when a deep neural network model learns categorical boundaries, which may partly explain their superior performance in pattern classification applications.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

شبکه عصبی پیچشی با پنجره‌های قابل تطبیق برای بازشناسی گفتار

Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

متن کامل

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

متن کامل

Nonlinear Model Improves Stock Return Out of Sample Forecasting (Case Study: United State Stock Market)

Improving out-of-sample forecasting is one of the main issues in financial research. Previous studies have achieved this objective by increasing the number of input variables or changing the kind of input variables. Changing the forecasting model is another possible approach to improve out-of-sample forecasting. Most researches have focused on linear models, while few have studied nonlinear mod...

متن کامل

Verification of an Evolutionary-based Wavelet Neural Network Model for Nonlinear Function Approximation

Nonlinear function approximation is one of the most important tasks in system analysis and identification. Several models have been presented to achieve an accurate approximation on nonlinear mathematics functions. However, the majority of the models are specific to certain problems and systems. In this paper, an evolutionary-based wavelet neural network model is proposed for structure definiti...

متن کامل

بهبود مدل تفکیک‌کننده منیفلدهای غیرخطی به‌منظور بازشناسی چهره با یک تصویر از هر فرد

Manifold learning is a dimension reduction method for extracting nonlinear structures of high-dimensional data. Many methods have been introduced for this purpose. Most of these methods usually extract a global manifold for data. However, in many real-world problems, there is not only one global manifold, but also additional information about the objects is shared by a large number of manifolds...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016